Diagnostics of speech recognition using classification phoneme diagnostic trees

نویسندگان

  • Milos Cernak
  • Christian Wellekens
چکیده

More than three decades of speech recognition research resulted in a very sophisticated statistical framework. However, less attention was still devoted to diagnostics of speech recognition; most previous research report on results in terms of ever-lower WER in various intrinsic or environmental conditions. This paper presents a diagnostics of the decoding process of ASR systems. The purpose of our diagnostics is to go beyond standard evaluation in terms of WERs and confusion matrices, and to look at the recognized output in more details. During the decoding phase, some specific data are collected at the decoder as possible causes of errors, and later are statistically analyzed using classification and regression trees. Focusing on pure acoustic phone decoding without language modeling, we present and discuss the results of the diagnostics that is used for an analysis of impact of intrinsic speech variabilities on speech recognition.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Allophone-based acoustic modeling for Persian phoneme recognition

Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation which refers to the integration of sounds, is one of the important obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighbor phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...

متن کامل

Phoneme Classification Using Temporal Tracking of Speech Clusters in Spectro-temporal Domain

This article presents a new feature extraction technique based on the temporal tracking of clusters in spectro-temporal features space. In the proposed method, auditory cortical outputs were clustered. The attributes of speech clusters were extracted as secondary features. However, the shape and position of speech clusters change during the time. The clusters temporally tracked and temporal tra...

متن کامل

Improving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM

Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research achievements show that using deep neural network (DNN) in speech recognition systems significantly improves the performance of these systems. There are two phases in DNN-based phoneme recognition systems including training and testing. Mos...

متن کامل

بهبود عملکرد سیستم بازشناسی گفتار پیوسته بوسیله ویژگی‌های استخراج شده از مانیفولدهای گفتاری در فضای بازسازی شده فاز

The design for new feature extraction methods out of the speech signal and combination of their obtained information is one of the most effective approaches to improve the performance of automatic speech recognition (ASR) system. Recent researches have been shown that the speech signal contains nonlinear and chaotic properties, but the effects of these properties are not used in the continuous ...

متن کامل

Error Analysis Using Decision Trees in Spontaneous Presentation Speech Recognition

This paper proposes the use of decision trees for analyzing errors in spontaneous presentation speech recognition. The trees are designed to predict whether a word or a phoneme can be correctly recognized or not, using word or phoneme attributes as inputs. The trees are constructed using training “cases” by choosing questions about attributes step by step according to the gain ratio criterion. ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2006